AITopics | online learning

online learning

Dynamic Regret Reduces to Kernelized Static Regret

Neural Information Processing Systems | Jun-23-2026, 04:09:46 GMT

We study dynamic regret in online convex optimization, where the objective is to achieve low cumulative loss relative to an arbitrary benchmark sequence. By observing that competing with an arbitrary sequence of comparators $u_1,\ldots,u_T$ in $\mathcal{W} \subseteq \mathbb{R}^d$ can be reframed as competing with a fixed comparator function $u : [1,T] \to \mathcal{W}$, we cast dynamic regret minimization as a static regret problem in a function space. By carefully constructing a suitable function space in the form of a Reproducing Kernel Hilbert Space (RKHS), our reduction enables us to recover the optimal $R_T(u_1,\ldots,u_T) = \mathcal{O}\big(\sqrt{\sum_t \|u_t - u_{t-1}\| T}\big)$ dynamic regret guarantee in the setting of linear losses, and yields new scale-free and directionally-adaptive dynamic regret guarantees. Moreover, unlike prior dynamic-to-static reductions--which are valid only for linear losses--our reduction holds for any sequence of losses, allowing us to recover $\mathcal{O}\big(\|u\|_{\mathcal{H}}^2 + d_{\mathrm{eff}}(\lambda) \ln T\big)$ bounds when the losses have meaningful curvature, where $d_{\mathrm{eff}}(\lambda)$ is a measure of complexity of the RKHS. Despite working in an infinite-dimensional space, the resulting reduction leads to algorithms that are computable in practice, due to the reproducing property of RKHSs.
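
For reference, the quantities named in this abstract can be written out as below. This is the standard definition of dynamic regret in online convex optimization, not a formula quoted from the paper. The symbols $\ell_t$ (loss at round $t$), $w_t$ (the learner's play) and $P_T$ (path length of the comparator sequence) are notation assumed here for illustration.

```latex
R_T(u_1,\ldots,u_T) = \sum_{t=1}^{T} \ell_t(w_t) - \sum_{t=1}^{T} \ell_t(u_t),
\qquad
P_T = \sum_{t=2}^{T} \|u_t - u_{t-1}\| .
```

Static regret is the special case $u_1 = \cdots = u_T$, for which $P_T = 0$. Setting $u(t) = u_t$ gives the fixed comparator function that the abstract describes.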

algorithm, artificial intelligence, machine learning, (16 more...)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)

Online Learning of Neural Networks

Neural Information Processing Systems | Jun-14-2026, 13:45:22 GMT

We study online learning of feedforward neural networks with the sign activation function that implement functions from the unit ball in $\mathbb{R}^d$ to a finite label set $\mathcal{Y} = \{1,\ldots,Y\}$. First, we characterize a margin condition that is sufficient and in some cases necessary for online learnability of a neural network: Every neuron in the first hidden layer classifies all instances with some margin $\gamma$ bounded away from zero. Quantitatively, we prove that for any net, the optimal mistake bound is at most approximately $\mathrm{TS}(d,\gamma)$, which is the $(d,\gamma)$-totally-separable-packing number, a more restricted variation of the standard $(d,\gamma)$-packing number. We complement this result by constructing a net on which any learner makes $\mathrm{TS}(d,\gamma)$ many mistakes. We also give a quantitative lower bound of approximately $\mathrm{TS}(d,\gamma) \geq \max\{1/(\gamma \sqrt{d})^d, d\}$ when $\gamma \leq 1/2$, implying that for some nets and input sequences every learner will err for $\exp(d)$ many times, and that a dimension-free mistake bound is almost always impossible.
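
As a rough illustration of the margin condition described above, the sketch below computes the smallest margin that any first-layer neuron achieves on a finite set of instances. It assumes the margin of a neuron $(w, b)$ on an instance $x$ is the normalized distance $|\langle w, x\rangle + b| / \|w\|$; the paper's exact definition may differ. The function name `first_layer_margin` is hypothetical.

```python
import math


def first_layer_margin(weights, biases, instances):
    """Smallest value of |<w, x> + b| / ||w|| over all first-layer neurons
    (w, b) and all instances x.

    Assumption: margin is the Euclidean distance from x to the neuron's
    hyperplane. A result bounded away from zero means every first-layer
    neuron classifies every instance with that margin.
    """
    margin = math.inf
    for w, b in zip(weights, biases):
        norm = math.sqrt(sum(wi * wi for wi in w))
        for x in instances:
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            margin = min(margin, abs(activation) / norm)
    return margin


# Two neurons in R^2 and two instances inside the unit ball.
weights = [[1.0, 0.0], [0.0, 2.0]]
biases = [0.0, 0.0]
instances = [[0.5, 0.5], [-0.5, 0.25]]
gamma = first_layer_margin(weights, biases, instances)
```

Here the second neuron on the second instance gives the binding constraint, so `gamma` is the distance of that instance from the hyperplane $x_2 = 0$.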

artificial intelligence, machine learning, neuron, (16 more...)

Country: North America > United States (0.67)

Genre: Research Report > Experimental Study (1.00)

Industry: Education > Educational Setting > Online (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Computable universal online learning

Neural Information Processing Systems | Jun-14-2026, 07:31:53 GMT

Understanding when learning is possible is a fundamental task in the theory of machine learning. However, many characterizations known from the literature deal with abstract learning as a mathematical object and ignore the crucial question: when can learning be implemented as a computer program? We address this question for universal online learning, a generalist theoretical model of online binary classification, recently characterized by Bousquet et al. (STOC 2021). In this model, there is no hypothesis fixed in advance; instead, Adversary--playing the role of Nature--can change their mind as long as local consistency with the given class of hypotheses is maintained. We require Learner to achieve a finite number of mistakes while using a strategy that can be implemented as a computer program. We show that universal online learning does not imply computable universal online learning, even if the class of hypotheses is relatively easy from a computability-theoretic perspective. We then study the agnostic variant of computable universal online learning and provide an exact characterization of classes that are learnable in this sense. We also consider a variant of proper universal online learning and show exactly when it is possible. Together, our results give a more realistic perspective on the existing theory of online binary classification and the related problem of inductive inference.

artificial intelligence, machine learning, universal online learning, (7 more...)

Industry: Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Learning Across the Gap: Hybrid Multi-armed Bandits with Heterogeneous Offline and Online Data

Neural Information Processing Systems | Jun-10-2026, 23:31:53 GMT

The multi-armed bandit (MAB) is a fundamental online decision-making framework that has been extensively studied over the past two decades. To mitigate the high cost and slow convergence of purely online learning, modern MAB approaches have explored paradigms that leverage offline data to warm-start online learning. However, existing approaches face a significant limitation by assuming that the offline and online data are homogeneous--they share the same feedback structure and are drawn from the same underlying distribution. This assumption is often violated in practice, where offline data often originate from diverse sources and evolving environments, resulting in feedback heterogeneity and distributional shifts. In this work, we tackle the challenge of learning across this offline-online gap by developing a general hybrid bandit framework that incorporates heterogeneous offline data to improve online performance. We study two hybrid settings: (1) using reward-based offline data to accelerate online learning in preference-based bandits (i.e., dueling bandits), and (2) using preference-based offline data to improve online standard MAB algorithms. For both settings, we design novel algorithms and derive tight regret bounds that match or improve upon existing benchmarks despite heterogeneity. Empirical evaluations on both synthetic and real-world datasets show that our proposed methods significantly outperform baseline algorithms.
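
To make the warm-start idea concrete, here is a minimal sketch of the homogeneous baseline that the abstract contrasts itself with: a UCB1-style learner whose counts and reward sums are initialised from offline samples of the same reward type. It is not the hybrid algorithm proposed in the paper. The function name `warm_start_ucb` and the `pull` callback are hypothetical.

```python
import math


def warm_start_ucb(offline_rewards, pull, horizon):
    """UCB1 whose statistics are initialised from logged offline rewards.

    offline_rewards: one list of logged rewards in [0, 1] per arm.
    pull: callable mapping an arm index to an online reward in [0, 1].
    Assumption: offline and online rewards share one distribution, which
    is exactly the homogeneity assumption the abstract says often fails.
    Returns the list of arms chosen during the online phase.
    """
    n_arms = len(offline_rewards)
    counts = [len(r) for r in offline_rewards]
    sums = [float(sum(r)) for r in offline_rewards]
    chosen = []
    for _ in range(horizon):
        total = sum(counts)

        def index(arm):
            # Arms with no data at all are tried first.
            if counts[arm] == 0:
                return math.inf
            bonus = math.sqrt(2.0 * math.log(total + 1) / counts[arm])
            return sums[arm] / counts[arm] + bonus

        arm = max(range(n_arms), key=index)
        reward = pull(arm)
        counts[arm] += 1
        sums[arm] += reward
        chosen.append(arm)
    return chosen


# Deterministic toy environment: arm 1 is better than arm 0.
offline = [[0.1] * 50, [0.9] * 50]
choices = warm_start_ucb(offline, lambda arm: 0.9 if arm == 1 else 0.1, 200)
```

With this much offline data the learner has little reason to explore arm 0 online. If the logged rewards came from a shifted distribution, the same initialisation would bias the indices, which is the failure mode the abstract targets.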

artificial intelligence, machine learning, proceedings, (7 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.39)

Characterizing and Correcting Effective Target Shift in Online Learning

Li, Ziyan, Hiratani, Naoki

arXiv.org Machine Learning | May-11-2026

Online learning from a stream of data is a defining feature of intelligence, yet modern machine learning systems often struggle in this setting, especially under distributional shift. To understand its basic properties, we study the relationship between online and offline learning in the context of kernel regression. We derive a closed-form expression for the function learned by online kernel regression, revealing that online kernel regression is equivalent to offline regression with shifted, inaccurate target outputs. Conversely, we show that by compensating for this effective shift in the teaching signal through target correction, online kernel-based learning can provably learn the same predictor as its offline counterpart. We derive both a closed-form expression for this target correction and an iterative form that can be applied sequentially. Applying this framework to image classification tasks on CIFAR-10 and CORe50, we show that online stochastic gradient descent with iteratively corrected targets outperforms learning with the true targets in continual learning settings. This work therefore provides a basic framework for analyzing and improving online learning in non-stationary environments.

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv:2605.07886

Country: North America (0.46)

Genre: Research Report (0.64)

Industry:

Education > Educational Setting > Online (0.84)
Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

A Note on How to Remove the $\ln\ln T$ Term from the Squint Bound

Orabona, Francesco

arXiv.org Machine Learning | Apr-30-2026

In Orabona and Pál [2016], we introduced the shifted KT potentials, to remove the $\ln \ln T$ factor in the parameter-free learning with expert bound. In this short technical note, I show that this is equivalent to changing the prior in the Krichevsky--Trofimov algorithm. Then, I show how to use the same idea to remove the $\ln \ln T$ factor in the data-independent bound for the Squint algorithm.
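
For readers unfamiliar with the Krichevsky--Trofimov algorithm mentioned here, the following is a minimal sketch of the standard KT coin-betting strategy. It is not the shifted variant or the modified prior discussed in the note. The function name `kt_coin_betting` is hypothetical.

```python
def kt_coin_betting(outcomes, initial_wealth=1.0):
    """Standard Krichevsky-Trofimov bettor on outcomes c_t in [-1, 1].

    At round t the bettor stakes the signed fraction
    (c_1 + ... + c_{t-1}) / t of its current wealth on the next outcome.
    Returns the wealth after each round.
    """
    wealth = initial_wealth
    outcome_sum = 0.0
    history = []
    for t, c in enumerate(outcomes, start=1):
        fraction = outcome_sum / t  # magnitude at most (t - 1) / t < 1
        wealth *= 1.0 + fraction * c
        outcome_sum += c
        history.append(wealth)
    return history


# Three consecutive "heads": the bettor grows increasingly confident.
wealth_path = kt_coin_betting([1.0, 1.0, 1.0])
```

Because the betting fraction always has magnitude below one, the wealth stays strictly positive no matter how the outcomes are chosen.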

algorithm, artificial intelligence, machine learning, (15 more...)

arXiv:2604.26926

Genre: Research Report (0.43)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.48)

Oracle-Efficient Online Learning for Smoothed Adversaries

Neural Information Processing Systems | Apr-24-2026, 21:46:41 GMT

We study the design of computationally efficient online learning algorithms under smoothed analysis. In this setting, at every step an adversary generates a sample from an adaptively chosen distribution whose density is upper bounded by $1/\sigma$ times the uniform density. Given access to an offline optimization (ERM) oracle, we give the first computationally efficient online algorithms whose sublinear regret depends only on the pseudo/VC dimension $d$ of the class and the smoothness parameter $\sigma$.

algorithm, artificial intelligence, machine learning, (14 more...)

Country: North America > United States (0.28)

Genre: Research Report (0.46)

Industry: Education > Educational Setting > Online (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.66)

Parameter-Free Online Learning via Model Selection

Neural Information Processing Systems | Apr-23-2026, 15:39:59 GMT

We introduce an efficient algorithmic framework for model selection in online learning, also known as parameter-free online learning. Departing from previous work, which has focused on highly structured function classes such as nested balls in Hilbert space, we propose a generic meta-algorithm framework that achieves online model selection oracle inequalities under minimal structural assumptions. We give the first computationally efficient parameter-free algorithms that work in arbitrary Banach spaces under mild smoothness assumptions; previous results applied only to Hilbert spaces. We further derive new oracle inequalities for matrix classes, non-nested convex sets, and $\mathbb{R}^{d}$ with generic regularizers. Finally, we generalize these results by providing oracle inequalities for arbitrary non-linear classes in the online supervised learning model. These results are all derived through a unified meta-algorithm scheme using a novel multi-scale algorithm for prediction with expert advice based on random playout, which may be of independent interest.
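
As background for the phrase "prediction with expert advice", the sketch below implements the classical exponential-weights (Hedge) forecaster for losses in $[0, 1]$. It is the textbook single-scale algorithm, not the multi-scale, random-playout method introduced in the paper. The names `hedge` and `_normalize` are hypothetical.

```python
import math


def _normalize(log_weights):
    """Turn log-weights into a probability vector without overflow."""
    m = max(log_weights)
    w = [math.exp(x - m) for x in log_weights]
    z = sum(w)
    return [x / z for x in w]


def hedge(loss_rounds, learning_rate):
    """Exponential weights over a fixed, finite set of experts.

    loss_rounds: list of rounds, each a list of per-expert losses in [0, 1].
    Returns (cumulative expected loss of the learner, final distribution).
    """
    n_experts = len(loss_rounds[0])
    log_weights = [0.0] * n_experts  # uniform prior over experts
    learner_loss = 0.0
    for losses in loss_rounds:
        probs = _normalize(log_weights)
        learner_loss += sum(p * l for p, l in zip(probs, losses))
        # Multiplicative update, carried out in log space.
        log_weights = [lw - learning_rate * l
                       for lw, l in zip(log_weights, losses)]
    return learner_loss, _normalize(log_weights)


# Two experts over T rounds: expert 0 is always right, expert 1 always wrong.
T = 100
eta = math.sqrt(8 * math.log(2) / T)
loss, final_probs = hedge([[0.0, 1.0]] * T, eta)
```

With $N$ experts and learning rate $\sqrt{8 \ln N / T}$, the regret of this forecaster against the best expert is at most $\sqrt{(T/2) \ln N}$. Note that this tuning needs $T$ in advance, which is one of the parameters that parameter-free methods aim to avoid.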

artificial intelligence, machine learning, proceedings, (7 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
